Introduction:

The primary objectives are to explore (1) the association between disability prevalence and age, and (2) the association between the prevalence of different types of disability across the US, especially between cognitive disability and mobility disability.

Background of the raw dataset

The data source for this analysis is the Disability and Health Data System (DHDS) 2018, an online data source that provides the prevalence of adult disabilities at the region/state level in the US in 2018. For each region/state, prevalence data on 6 functional disability types are reported by age group, race, gender, and veteran status. The 6 types are cognitive (serious difficulty concentrating, remembering or making decisions), hearing (serious difficulty hearing or deaf), mobility (serious difficulty walking or climbing stairs), vision (serious difficulty seeing), self-care (difficulty dressing or bathing) and independent living (difficulty doing errands alone). Please note that every data point in the data source is an actual state/region.

Several specific questions were addressed.

  • Is the prevalence of disability distributed evenly across states?
  • What is the mean prevalence of each type of disability in each age group?
  • Within each age group, which state has the highest/lowest overall prevalence of any disability? How about cognitive disability and mobility disability?
  • Within each age group, what type of disability is the most prevalent across all the states?
  • Is there an association between age and prevalence of disability?
  • Is there an association between the prevalence of cognitive disability and mobility disability?

Methods:

Data Source

  • The raw data was downloaded from the Centers for Disease Control and Prevention (https://data.cdc.gov/Disability-Health/DHDS-Prevalence-of-Disability-Status-and-Types-by-/qjg3-6acf).
  • There are 7168 rows and 31 columns in the raw dataset; each row gives one piece of information at the state/region level.
  • The raw dataset is a long dataset: each row reports the state- or region-level prevalence of one type of disability for one “response type” (e.g. age, race, gender, veteran status).
  • The prevalence values of the 6 disability types don't add up to the prevalence of any disability. A likely explanation is that a proportion of people have multiple conditions and are counted under more than one type.
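The last point can be checked with simple arithmetic. The sketch below is written in Python rather than the R packages used for this analysis, and it uses the age-group means from the Results table, rounded to two decimals. The variable and function names are illustrative only.

```python
# Age-group means (%) from the Results table, rounded to two decimals.
any_disability = {"18-44": 18.86, "45-64": 29.77, "65+": 44.13}

# Order: cognitive, hearing, mobility, vision, self-care, independent living.
by_type = {
    "18-44": [12.02, 2.58, 4.75, 3.35, 1.91, 5.45],
    "45-64": [12.41, 7.27, 17.70, 6.62, 5.43, 8.29],
    "65+": [10.18, 17.37, 26.96, 7.80, 5.27, 9.82],
}


def overlap_excess(age):
    """Sum of the six type-specific means minus the any-disability mean."""
    return sum(by_type[age]) - any_disability[age]


for age in any_disability:
    print(f"{age}: sum of types = {sum(by_type[age]):.2f}%, "
          f"any disability = {any_disability[age]:.2f}%, "
          f"excess = {overlap_excess(age):.2f}")
```

A positive excess in every age group is consistent with some people reporting more than one type of disability, although the averages alone cannot show how large that overlap is.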

Data Preparation

  • The R packages “data.table” and “dplyr” were mainly used to inspect and clean the data to create a final dataset for further analysis.
  • To obtain a final dataset tailored to the research questions of interest, I kept only the relevant rows, i.e. those for which the ‘response type’ was age. Some variables were renamed for easier reference.
  • To allow comparisons between the prevalence of different types of disability, the dataset was reshaped from long to wide, with the prevalence values of the different disabilities listed as separate columns for each state/region by age.
  • By comparing the prevalence values of the different types of disability, a new categorical variable was created to record the disability type with the greatest prevalence for each state/region by age.
  • The final main dataset for analysis has 162 rows/observations (each row gives the statistics for one state/region in one age group) and 13 columns/variables of interest.
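The preparation steps above (filter to age rows, reshape long to wide, record the most prevalent type) can be sketched as follows. This is a minimal Python illustration, not the R code used for the analysis. The column names (`state`, `response_type`, `stratum`, `disability`, `prevalence`) and all row values are hypothetical placeholders and do not come from the DHDS file.

```python
from collections import defaultdict

# Toy long-format rows. Column names and values are hypothetical placeholders,
# not records from the DHDS file.
long_rows = [
    {"state": "A", "response_type": "age", "stratum": "18-44",
     "disability": "cognitive", "prevalence": 11.0},
    {"state": "A", "response_type": "age", "stratum": "18-44",
     "disability": "mobility", "prevalence": 4.0},
    {"state": "A", "response_type": "age", "stratum": "65+",
     "disability": "cognitive", "prevalence": 9.0},
    {"state": "A", "response_type": "age", "stratum": "65+",
     "disability": "mobility", "prevalence": 25.0},
    {"state": "A", "response_type": "gender", "stratum": "Female",
     "disability": "mobility", "prevalence": 15.0},
]


def to_wide(rows):
    """Keep age-stratified rows and pivot to one record per (state, age group)."""
    wide = defaultdict(dict)
    for row in rows:
        if row["response_type"] != "age":
            continue  # drop race, gender, and veteran-status strata
        wide[(row["state"], row["stratum"])][row["disability"]] = row["prevalence"]
    return dict(wide)


def top_type(wide):
    """Name the disability type with the greatest prevalence in each record."""
    return {key: max(values, key=values.get) for key, values in wide.items()}


wide = to_wide(long_rows)
for key, label in top_type(wide).items():
    print(key, wide[key], "->", label)
```

With three age groups per state/region, a reshape of this kind yields one row per state/region and age group, which matches the 162-row shape described above.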

Exploratory Data Analysis

  • In the final dataset for analysis, there is no missing data on the key variables of interest (prevalence values for the different types of disability).
  • The prevalence of any disability is not normally distributed. With a mean of 30.92%, most data points are concentrated between 15-25% and 40-45%. Except for hearing and mobility disabilities, the prevalence of every other type of disability is normally distributed. Mobility disability has the highest mean prevalence (16.6%), whereas self-care disability has the lowest (4.365%).
  • For age 18-44, the mean and median prevalence of any disability are 18.86% and 18.7%; any disability is least prevalent in DC (12.9%) and most prevalent in Puerto Rico (29.3%). For age 45-64, the mean and median are 29.77% and 28.1%; any disability is least prevalent in Colorado (20.6%) and most prevalent in Puerto Rico (53.3%). For age 65+, the mean and median are 44.13% and 43%; any disability is least prevalent in Colorado (32.2%) and most prevalent in Puerto Rico (62.8%).
  • These preliminary results were checked against the external CDC report that 26% of adults in the US have some type of disability (https://www.cdc.gov/ncbddd/disabilityandhealth/infographic-disability-impacts-all.html). Unfortunately, the raw dataset doesn't provide a way to weight the data by age, so we are unable to generate a weighted overall average prevalence of any disability.
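To make the weighting issue concrete, the sketch below contrasts the unweighted mean of the three age-group means with a weighted mean. It is written in Python for illustration. The age-group means are the ones reported here (rounded to two decimals), but the population shares are round placeholder numbers invented for the example, not census figures, so the weighted result has no empirical meaning.

```python
# Age-group means of any-disability prevalence (%), rounded to two decimals.
age_means = {"18-44": 18.86, "45-64": 29.77, "65+": 44.13}

# Unweighted: every age group counts equally.
unweighted = sum(age_means.values()) / len(age_means)


def weighted_mean(means, weights):
    """Average the group means using the given weights (e.g. population shares)."""
    total = sum(weights.values())
    return sum(means[group] * weights[group] for group in means) / total


# HYPOTHETICAL population shares, for illustration only (not census figures).
placeholder_shares = {"18-44": 0.5, "45-64": 0.3, "65+": 0.2}
weighted = weighted_mean(age_means, placeholder_shares)

print(f"unweighted mean: {unweighted:.2f}%")
print(f"weighted mean (placeholder shares): {weighted:.2f}%")
```

Because younger adults have the lowest prevalence, any weighting that gives them the largest share pulls the overall figure below the unweighted 30.92%. That is one plausible reason the unweighted mean sits above the CDC's 26%.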

Results

Table of Disability Prevalence by Age (%)

Age     Any disability  Cognitive  Hearing  Mobility  Vision  Self-care  Independent living
18-44   18.86           12.02      2.58     4.75      3.35    1.91       5.45
45-64   29.77           12.41      7.27     17.70     6.62    5.43       8.29
65+     44.13           10.18      17.37    26.96     7.80    5.27       9.82

Boxplot for disability prevalence by age

The median (range) of the prevalence of any disability across the nation is 18.7% (12.9-29.3%) for age 18-44, 28.1% (20.6-53.3%) for age 45-64, and 43% (32.2-62.8%) for age 65+.
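As a small illustration of how the range endpoints are read off, the Python sketch below groups records by age and picks the lowest and highest state/region. It includes only the handful of values quoted in this report (DC, Colorado, Puerto Rico, West Virginia), so it cannot reproduce the medians, which need all states/regions. The function name is illustrative.

```python
# (age group, state/region, prevalence of any disability in %), as quoted in this report.
quoted = [
    ("18-44", "DC", 12.9),
    ("18-44", "Puerto Rico", 29.3),
    ("18-44", "West Virginia", 25.8),
    ("45-64", "Colorado", 20.6),
    ("45-64", "Puerto Rico", 53.3),
    ("45-64", "West Virginia", 48.4),
    ("65+", "Colorado", 32.2),
    ("65+", "Puerto Rico", 62.8),
    ("65+", "West Virginia", 61.1),
]


def extremes_by_age(records):
    """Return {age: ((lowest_state, value), (highest_state, value))}."""
    grouped = {}
    for age, state, value in records:
        grouped.setdefault(age, []).append((state, value))
    return {
        age: (min(pairs, key=lambda p: p[1]), max(pairs, key=lambda p: p[1]))
        for age, pairs in grouped.items()
    }


for age, (low, high) in extremes_by_age(quoted).items():
    print(f"{age}: lowest {low[0]} ({low[1]}%), highest {high[0]} ({high[1]}%)")
```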

Distribution of disability prevalence by age

Please note that each dot in the graph is an actual state/region; the data can look quite spread out because of the small sample size.

  • The prevalence of any disability is normally distributed for age 18-44 and age 65+, but not for age 45-64.
  • For age 18-44, the prevalence of any disability is concentrated on the left, with the most frequent value being 20.6% (N=10 states/regions).
  • For age 45-64, the prevalence of any disability is concentrated in the middle, with the most frequent value being 39.6% (N=10 states/regions).
  • For age 65+, the prevalence of any disability is concentrated on the right, with the most frequent value being 43% (N=10 states/regions).
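Counts such as "N=10 states/regions" depend on how values are grouped. The Python sketch below assumes the counts come from grouping prevalence values into fixed-width bins, which is an assumption about the plots rather than something stated in this report. The input values are hypothetical and the function name is illustrative.

```python
from collections import Counter


def modal_bin(values, width=1.0):
    """Group values into bins of the given width; return (bin_start, count) of the fullest bin."""
    bins = Counter((value // width) * width for value in values)
    return max(bins.items(), key=lambda item: item[1])


# Hypothetical prevalence values (%), for illustration only.
toy_values = [17.2, 18.1, 18.4, 18.9, 19.5, 21.0, 24.3, 29.3]

start, count = modal_bin(toy_values, width=1.0)
print(f"fullest bin starts at {start}% and holds {count} values")
```

Changing `width` changes both the modal value and its count, so the bin width is worth reporting alongside figures of this kind.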

Please find the interactive graphs for the distribution of cognitive and mobility disability on my website.

  • For age 18-44, the prevalence of cognitive disability is under 10% everywhere, and most states/regions are under 5% (N=25).
  • For age 45-64, the most frequent values are around 12.5% and around 16% (N=9 states/regions each).
  • For age 65+, the prevalence of cognitive disability is concentrated between 22% and 28%, with the most frequent value being around 27% (N=9 states/regions).
  • The distribution of the prevalence of mobility disability is more concentrated than that of cognitive disability.

What is the most prevalent disability?

Across the nation, cognitive disability is the most prevalent disability for the young population (aged 18-44) in all states, and mobility disability is the most prevalent for the older population (please refer to the bar chart presented on my website).

Association between cognitive and mobility disability

  • A positive association between the prevalence of cognitive disability and mobility disability is observed in all age groups.
  • The slope is flattest in the younger population (aged 18-44) and becomes steeper in the older population (aged 45 and over).
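One way to quantify the direction and steepness described above is a least-squares slope and a Pearson correlation computed separately for each age group. The Python sketch below is an illustration under stated assumptions: treating cognitive prevalence as x and mobility prevalence as y is a choice made for the example, and the data points are hypothetical, not values from the dataset.

```python
from math import sqrt


def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical (cognitive %, mobility %) values per age group, for illustration only.
toy = {
    "18-44": ([10.0, 12.0, 14.0, 16.0], [4.0, 4.5, 5.5, 6.0]),
    "65+": ([8.0, 10.0, 12.0, 14.0], [22.0, 26.0, 29.0, 33.0]),
}

for age, (cognitive, mobility) in toy.items():
    slope, r = slope_and_r(cognitive, mobility)
    print(f"{age}: slope = {slope:.2f}, r = {r:.3f}")
```

A positive `r` corresponds to the positive association, and a larger slope in the older group corresponds to the steeper line.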

Geographic distribution of disability prevalence by age

  • The prevalence of disability is unevenly distributed across the US in all age groups.
  • The percentage of people living with disabilities is highest in the South region of the US, especially in Kentucky, West Virginia, and Mississippi. The situation in West Virginia is the most severe: the prevalence of any disability is 25.8% for age 18-44, 48.4% for age 45-64, and 61.1% for age 65+.

Conclusion: